Ingeniería Biomédica
2024-08-12
In the example on the previous slide, the data were modelled as a linear function. The difference (error) between the modelled data \(\left( \hat{y}_n \right)\) and the actual data \(\left( y_n \right)\) can be written in several ways, for example as the mean squared error (MSE), the root mean squared error (RMSE), or the mean absolute error (MAE):
Cost function
\[E = \frac{1}{N} \sum_{n=1}^{N}{\left( \hat{y}_n - y_n \right)^2}\]
\[E = \sqrt{\frac{1}{N} \sum_{n=1}^{N}{\left( \hat{y}_n - y_n \right)^2}}\]
\[E = \frac{1}{N} \sum_{n=1}^{N}{\left| \hat{y}_n - y_n \right| }\]
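For reference, a minimal NumPy sketch of these three cost functions; the array names `y_hat` and `y` are placeholders for the modelled and actual data:

```python
import numpy as np

def mse(y_hat, y):
    # Mean squared error: (1/N) * sum((y_hat_n - y_n)^2)
    return np.mean((y_hat - y) ** 2)

def rmse(y_hat, y):
    # Root mean squared error: square root of the MSE
    return np.sqrt(np.mean((y_hat - y) ** 2))

def mae(y_hat, y):
    # Mean absolute error: (1/N) * sum(|y_hat_n - y_n|)
    return np.mean(np.abs(y_hat - y))
```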
Looking at the cost surface, we notice that it has a global minimum. Ideally, we would have an algorithm that finds it automatically.
Cost Surface
Indeed, there are multiple algorithms for finding minima. The most famous is the least squares method, but in this course we will use the gradient descent algorithm.
Assume that the data model is a function \(f\left(\theta_i, x_n, y_n\right)\), where the \(\theta_i\) are known as the model parameters.
The gradient descent algorithm
\[\boldsymbol{\theta}_{i,j+1} = \boldsymbol{\theta}_{i,j} - \eta \frac{\partial E}{\partial \boldsymbol{\theta}_{i}}\]
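Here \(\eta\) is the learning rate (step size) and \(j\) indexes the iterations. A minimal sketch of the resulting update loop, assuming a user-supplied (hypothetical) `grad_E` function that returns \(\partial E / \partial \boldsymbol{\theta}\) at the current parameters:

```python
import numpy as np

def gradient_descent(theta_init, grad_E, eta=0.01, n_iter=1000):
    # Repeated update: theta_{j+1} = theta_j - eta * dE/dtheta
    theta = np.asarray(theta_init, dtype=float)
    for _ in range(n_iter):
        theta = theta - eta * grad_E(theta)
    return theta
```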
Assumptions
\[\boldsymbol{\theta}_i = \left[ \theta_1, \theta_0 \right]^T\]
\[\hat{y}_n = \theta_1 x_n + \theta_0\]
\[E = \frac{1}{N} \sum_{n=1}^{N}{\left( \theta_1 x_n + \theta_0 - y_n \right)^2}\]
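Under these assumptions the cost of any candidate pair \(\left( \theta_1, \theta_0 \right)\) can be evaluated directly; a sketch, with `x` and `y` standing in for the measured data:

```python
import numpy as np

def linear_cost(theta1, theta0, x, y):
    # E = (1/N) * sum((theta1 * x_n + theta0 - y_n)^2)
    y_hat = theta1 * x + theta0
    return np.mean((y_hat - y) ** 2)
```

Evaluating `linear_cost` over a grid of \(\left( \theta_1, \theta_0 \right)\) pairs is one way to draw the cost surface discussed earlier.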
For \(\theta_1\) estimation
\[\boldsymbol{\theta}_{1,j+1} = \boldsymbol{\theta}_{1,j} - \eta \frac{\partial E}{\partial \boldsymbol{\theta}_{1}}\]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{1}} = \frac{\partial}{\partial \boldsymbol{\theta}_{1}} \left( \frac{1}{N} \sum_{n=1}^{N}{\left( \theta_1 x_n + \theta_0 - y_n \right)^2} \right) \]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{1}}= \frac{1}{N} \frac{\partial}{\partial \boldsymbol{\theta}_{1}} \left( \sum_{n=1}^{N}{\left( \theta_1 x_n + \theta_0 - y_n \right)^2} \right) \]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{1}}= \frac{1}{N} \sum_{n=1}^{N}{\frac{\partial}{\partial \boldsymbol{\theta}_{1}} \left( \left( \theta_1 x_n + \theta_0 - y_n \right)^2\right)}\]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{1}}= \frac{1}{N} \sum_{n=1}^{N}{2 \left( \theta_1 x_n + \theta_0 - y_n \right) x_n}\]
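The last expression translates directly into code; a minimal sketch, again with `x` and `y` as placeholders for the data:

```python
import numpy as np

def grad_theta1(theta1, theta0, x, y):
    # dE/dtheta1 = (1/N) * sum(2 * (theta1 * x_n + theta0 - y_n) * x_n)
    return np.mean(2 * (theta1 * x + theta0 - y) * x)
```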
For \(\theta_0\) estimation
\[\boldsymbol{\theta}_{0,j+1} = \boldsymbol{\theta}_{0,j} - \eta \frac{\partial E}{\partial \boldsymbol{\theta}_{0}}\]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{0}} = \frac{\partial}{\partial \boldsymbol{\theta}_{0}} \left( \frac{1}{N} \sum_{n=1}^{N}{\left( \theta_1 x_n + \theta_0 - y_n \right)^2} \right) \]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{0}}= \frac{1}{N} \frac{\partial}{\partial \boldsymbol{\theta}_{0}} \left( \sum_{n=1}^{N}{\left( \theta_1 x_n + \theta_0 - y_n \right)^2} \right) \]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{0}}= \frac{1}{N} \sum_{n=1}^{N}{\frac{\partial}{\partial \boldsymbol{\theta}_{0}} \left( \left( \theta_1 x_n + \theta_0 - y_n \right)^2\right)}\]
\[\frac{\partial E}{\partial \boldsymbol{\theta}_{0}}= \frac{1}{N} \sum_{n=1}^{N}{2 \left( \theta_1 x_n + \theta_0 - y_n \right)}\]
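Combining both partial derivatives gives the complete update loop for the linear model; a sketch, where the learning rate, iteration count, and synthetic data are illustrative choices only:

```python
import numpy as np

def fit_line(x, y, eta=0.1, n_iter=5000):
    # Gradient descent for y_hat_n = theta1 * x_n + theta0 with the MSE cost.
    theta1, theta0 = 0.0, 0.0              # arbitrary initial guess
    for _ in range(n_iter):
        err = theta1 * x + theta0 - y      # (y_hat_n - y_n)
        grad1 = np.mean(2 * err * x)       # dE/dtheta1
        grad0 = np.mean(2 * err)           # dE/dtheta0
        theta1 -= eta * grad1              # theta_{1,j+1} = theta_{1,j} - eta * dE/dtheta1
        theta0 -= eta * grad0              # theta_{0,j+1} = theta_{0,j} - eta * dE/dtheta0
    return theta1, theta0

# Example with synthetic data: y = 2x + 1 plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(x.size)
print(fit_line(x, y))  # expected to be close to (2.0, 1.0)
```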
The same procedure works for other models and cost functions. For example, for a quadratic model \(\hat{y}_n = \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0\) with the RMSE cost, applying the chain rule with the substitution \(u = \sum_{n=1}^{N}{\left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)^2}\):
\[ \begin{eqnarray} E & = & \sqrt{\frac{u}{N}}\\ \frac{\partial E}{\partial \boldsymbol{\theta}_{0}} &=& \frac{1}{2 \sqrt{N u}} \frac{\partial u}{\partial \boldsymbol{\theta}_{0}}\\ \frac{\partial u}{\partial \boldsymbol{\theta}_{0}} &=& 2\sum_{n=1}^{N}{\left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)}\\ \frac{\partial E}{\partial \boldsymbol{\theta}_{0}} &=& \frac{\sum_{n=1}^{N}{\left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)}}{\sqrt{N u}} \end{eqnarray} \]
Substituting \(u\) back and repeating for the other parameters:
\[ \begin{eqnarray} \frac{\partial E}{\partial \boldsymbol{\theta}_{0}} &=& \frac{\sum_{n=1}^{N}{\left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)}}{\sqrt{N \sum_{n=1}^{N}{\left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)^2}}}\\ \frac{\partial E}{\partial \boldsymbol{\theta}_{1}} &=& \frac{\sum_{n=1}^{N}{x_n \left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)}}{\sqrt{N \sum_{n=1}^{N}{\left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)^2}}}\\ \frac{\partial E}{\partial \boldsymbol{\theta}_{2}} &=& \frac{\sum_{n=1}^{N}{x_n^2 \left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)}}{\sqrt{N \sum_{n=1}^{N}{\left( \theta_2 x_{n}^{2} + \theta_1 x_n + \theta_0 - y_n \right)^2}}} \end{eqnarray} \]
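A sketch of these three partial derivatives in code, for the quadratic model with the RMSE cost; `theta`, `x`, and `y` are placeholders for the parameters and data:

```python
import numpy as np

def rmse_gradients(theta, x, y):
    # theta = [theta2, theta1, theta0]; y_hat = theta2*x^2 + theta1*x + theta0
    theta2, theta1, theta0 = theta
    err = theta2 * x**2 + theta1 * x + theta0 - y   # (y_hat_n - y_n)
    denom = np.sqrt(len(x) * np.sum(err ** 2))      # sqrt(N * sum(err^2))
    d_theta0 = np.sum(err) / denom
    d_theta1 = np.sum(x * err) / denom
    d_theta2 = np.sum(x**2 * err) / denom
    return np.array([d_theta2, d_theta1, d_theta0])
```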